CONTRIBUTIONS TO THE THEORY OF
ACCIDENT PRONENESS

II.  TRUE  OR  FALSE  CONTAGION 

BY 

GRACE  E.  BATES  AND  JERZY  NEYMAN 

1. Introduction. The first part of the present paper [1]¹ was concerned with the
theoretical aspect of the following practical question: can one use the number of
light accidents incurred by different individuals in the past to predict the number of
severe accidents in a hazardous occupation to be sustained in the future? The
theoretical assumptions underlying this study form an extension of the well-known
scheme due to Greenwood, Yule, and Newbold. An essential part of this scheme is
characterized by the postulates: (i) that the individuals of a population differ from
each other in accident proneness, (ii) that the accidents already incurred do not
change the probabilities of further accidents in the future, and (iii) that these
probabilities stay constant in time and are not modified by the experience that the
individual may gain in the particular occupation. These three postulates may be
symbolized by the combined term "mixture-no contagion-no time-effect model."
In order to be able to deal with two kinds of accidents, light and severe, the above
three postulates were supplemented by two more: (iv) that the expected number
of light accidents per unit of time is proportional to the expected number $\lambda$ of severe
accidents (this postulate was termed the fundamental hypothesis), and (v) that to
each severe accident there corresponds a fixed probability $\theta$ that the individual
involved in the accident will survive.

The present Part II of the paper deals with a comparison between the foregoing
scheme of mixture-no contagion-no time-effect and an alternative scheme due to
Pólya [2]. Pólya's scheme postulates (I) identity of the individuals with respect to
accident proneness (thus (I) is the denial of postulate (i)), (II) possible presence
of contagion of a specified type, and (III) possible effect of experience gained since
entering the particular occupation. Curiously, as discussed by Lundberg [3] and
Feller [4 and 5], if one considers the number of accidents incurred in a single period
of time, its distribution implied by the mixture-no contagion-no time-effect model
coincides with the distribution implied by the Pólya contagious scheme with the
additional assumption that prior to the period of observation all the individuals
concerned had the same number of accidents.

Naturally, no mathematical model of actual phenomena is ever absolutely exact.
However, it is an undeniable fact that some models fit a particular set of
phenomena better than others. In the present case a number of problems concerned
with personnel management make it important to distinguish between accidents
which do and those which do not show elements of contagion in the sense of Pólya
and between those in which the time elapsed since entering the hazardous
occupation has some effect or no effect on the probability of accidents. This is just the
problem studied in the following pages.

The work on this paper, begun under contract with the School of Aviation Medicine, was
completed with the partial support of the Office of Naval Research. Dr. Bates, a member of the
faculty of Mount Holyoke College, worked at the University of California on this project.

¹ Numbers in brackets refer to references at the end of the paper.

Using the scheme of Pólya in a slightly generalized form we deduce the
multivariate distribution of the numbers of accidents of the same severe type incurred and
survived in several successive periods of observation and of the number of these
periods that are survived by the particular individual. This distribution is then
compared with a corresponding distribution implied by the mixture-no contagion-no
time-effect scheme. It is shown that, just as soon as the accidents are observed
in more than one period, equivalence of the two distributions forces a condition on
the parameters of the generalized Pólya scheme so that, barring an exceptional
particular case, it is possible to distinguish between the two models.

As a by-product of this study we obtain the joint distributions of the number of
light accidents incurred in the past and of the number of severe accidents to be
incurred in the future, as implied by the postulates (i) through (v), which were given
in Part I without proof. These distributions are deduced separately for those
individuals who survive all the severe accidents incurred and separately for those who
succumb.

In the last section we outline what seems to be a more promising method of
approach to the problem of establishing the presence of contagion. This is based on the
study of the distribution of time intervals between successive accidents incurred
by particular individuals.

2. Basic assumptions. We shall consider an individual I who, from a certain
moment $t = 0$, is exposed to the risk of accidents of a specified kind, subject to five
postulates formulated below. The totality of these postulates will be denoted by
(P) and described as the generalized Pólya contagious scheme or model. In
formulating these postulates, it will be necessary to consider time intervals $(0, T_1)$ and
$(T_1, T_2)$ with $0 < T_1 < T_2$. These intervals will always be considered open on the
left and closed on the right, say $0 < t \le T_1$ and $T_1 < t \le T_2$, where t stands for a
moment in time.

Postulate P1. The individual I cannot die or otherwise cease to be exposed to
accidents except as a result of an accident which may prove fatal.

Postulate P2. Whatever the time interval $(T_1, T_2)$ with $0 \le T_1 < T_2$, if the
individual I is alive at $T_1$, the number of accidents, say $X(T_1, T_2)$, that he will incur and
survive in $(T_1, T_2)$ is a random variable whose distribution depends on $T_1$ and $T_2$ and on
the number of accidents incurred in the time interval $(0, T_1)$, but not on the precise times
when these accidents took place.

Accordingly, we shall consider probabilities $P_{m,n}(T_1, T_2)$ and $Q_{m,n}(T_1, T_2)$ defined
as follows:

$P_{m,n}(T_1, T_2)$ is the conditional probability that during the time interval $(T_1, T_2)$
the individual I will incur exactly n accidents and that he will survive them all,
given that at time $T_1$ he had incurred exactly m accidents and survived. If $T_1 = 0$,
then the only acceptable value of m will be m = 0.

$Q_{m,n}(T_1, T_2)$ is the conditional probability that during the time interval $(T_1, T_2)$
the individual I will incur exactly n + 1 accidents, that he will survive the first n
and die in the (n + 1)st, given that at time $T_1$ he had incurred exactly m accidents
and survived. Again, if $T_1 = 0$ then the only possible value of m will be m = 0.
Obviously,

(1)  $\sum_{n=0}^{\infty} \{P_{m,n}(T_1, T_2) + Q_{m,n}(T_1, T_2)\} \equiv 1$ .

Postulate P3. If $T_2 \to T_1$, all the probabilities $P_{m,n}(T_1, T_2)$ and $Q_{m,n}(T_1, T_2)$
converge to limits $P_{m,n}(T_1, T_1)$ and $Q_{m,n}(T_1, T_1)$, respectively.

More specifically,

(2)  $P_{m,0}(T_1, T_1) = 1$

for every m and, consequently, owing to (1),

(3)  $P_{m,n}(T_1, T_1) = Q_{m,n-1}(T_1, T_1) = 0$

for $n \ge 1$. The limits thus postulated will be interpreted as the probabilities of n
accidents, all survived or not, occurring in the interval of time of zero duration.
Postulate P4. To each accident there corresponds a fixed probability $\theta$ of surviving it.
Consequently,

(4)  $Q_{m,0}(T_1, T_2) = (1 - \theta)\,[1 - P_{m,0}(T_1, T_2)]$ .

Remark: A superficial examination of the problem may suggest that instead of (4)
the probability $Q_{m,0}(T_1, T_2)$ equals

(5)  $(1 - \theta)\,[P_{m,1}(T_1, T_2) + Q_{m,0}(T_1, T_2)]$ .

However, the reader will easily satisfy himself that the presumption is false because
it does not take into account the circumstance that with the moment of a fatal
accident the individual I ceases to be exposed to accidents which might have
otherwise occurred after this accident.

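Postulate P5. At least at $T_2 = T_1$, the probabilities $P_{m,n}(T_1, T_2)$ and $Q_{m,n}(T_1, T_2)$
are differentiable with respect to $T_2$ and, specifically,

(6)  $\left.\dfrac{\partial P_{m,0}(T_1, T_2)}{\partial T_2}\right|_{T_2 = T_1} = -\lambda\,\dfrac{1 + \mu m}{1 + \nu T_1}$ ,

where $\lambda$, $\mu$ and $\nu$ are nonnegative constants, and

(7)  $\left.\dfrac{\partial P_{m,n}(T_1, T_2)}{\partial T_2}\right|_{T_2 = T_1} = \left.\dfrac{\partial Q_{m,n-1}(T_1, T_2)}{\partial T_2}\right|_{T_2 = T_1} = 0$ ,  for $n \ge 2$ .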

It will be observed that with $\lambda > 0$, $\mu > 0$ and $\nu > 0$ equation (6) implies the
contagion and the time effect. $P_{m,0}(T_1, T_2)$ represents the probability of avoiding
accidents in $(T_1, T_2)$. When $T_2 = T_1$, then, according to (2), this probability is unity.
Equation (6) implies that, with the increase of $T_2$, the speed of falling off in this
probability is increased with the increase of m and is decreased with the increase
of $T_1$. Equations (7) imply that with $T_2 \to T_1$, the probability of more than one
accident in $(T_1, T_2)$ decreases faster than the difference $T_2 - T_1$. Also if $\mu = 0$, then
there is no contagion. If $\nu = 0$, then there is no time effect.
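
The roles of $\mu$ and $\nu$ in (6) can be illustrated numerically. The following minimal sketch, in Python, evaluates the rate $\lambda(1 + \mu m)/(1 + \nu T)$ that (6) attaches to an individual with m previous accidents at time T. The function name and the parameter values are illustrative assumptions and do not come from the paper.

```python
def accident_rate(lam, mu, nu, m, t):
    """Rate lam*(1 + mu*m)/(1 + nu*t) appearing (with a minus sign) in (6).

    lam, mu, nu are the nonnegative constants of Postulate P5,
    m is the number of accidents already incurred, t is the current time.
    """
    return lam * (1.0 + mu * m) / (1.0 + nu * t)


# Illustrative (assumed) parameter values.
lam, mu, nu = 0.8, 0.5, 0.3

# Contagion: with mu > 0 the rate grows with the number m of past accidents.
rates_by_m = [accident_rate(lam, mu, nu, m, t=1.0) for m in range(4)]

# Time effect: with nu > 0 the rate falls as the time t increases.
rates_by_t = [accident_rate(lam, mu, nu, m=0, t=t) for t in (0.0, 1.0, 5.0)]

# With mu = nu = 0 the rate is the constant lam, whatever m and t.
constant_rate = accident_rate(lam, 0.0, 0.0, m=3, t=7.0)
```

Setting only one of the two constants to zero isolates the other effect, which is the distinction the comparison of models (P) and (N) relies on.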

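The reader will notice that (4) and (6) imply

(8)  $\left.\dfrac{\partial Q_{m,0}(T_1, T_2)}{\partial T_2}\right|_{T_2 = T_1} = (1 - \theta)\,\lambda\,\dfrac{1 + \mu m}{1 + \nu T_1}$

and that then (1) and (7) imply

(9)  $\left.\dfrac{\partial P_{m,1}(T_1, T_2)}{\partial T_2}\right|_{T_2 = T_1} = \theta\,\lambda\,\dfrac{1 + \mu m}{1 + \nu T_1}$ .

Pólya's original scheme was considered as a limiting case of a system of drawings
from an urn and this led to the assumption $\mu = \nu$. Also, in the original scheme of
Pólya $\theta = 1$, so that there is no room for the probabilities $Q_{m,n}(T_1, T_2)$.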

As mentioned, the combination of postulates P1 through P5 will be denoted by
(P) and described as the generalized Pólya contagious scheme. This scheme will be
contrasted with another scheme to be denoted by (N) (connoting Newbold) which
consists of postulates P1 through P5 supplemented by P6 and P7 as follows:

Postulate P6. Contagion and time effect are absent, so that $\mu = \nu = 0$.

Postulate P7. The parameter $\lambda$ in (6), (8) and (9) is a particular value of a random
variable $\Lambda$ with the probability density function

(10)  $p_\Lambda(\lambda) = \dfrac{\beta^{\alpha}}{\Gamma(\alpha)}\,\lambda^{\alpha-1} e^{-\beta\lambda}$  for $0 < \lambda$ ,

where $\alpha$ and $\beta$ are arbitrary positive numbers.

It will be seen that except for the probability $\theta$ of surviving an accident, model
(N) coincides with the original mixture-no contagion-no time-effect model of
Greenwood, Yule, and Newbold.

Most of the study given below will refer to model (P) and it appears unnecessary
to complicate the formulae with constant explicit references to this model.
References to the two models in the form of letters P or N behind a vertical bar will
appear only in cases when there may be a misunderstanding.

3. Problem studied. Considering model (P) we shall visualize s + 1 consecutive
periods of time, the ith period beginning with $t_{i-1} \ge 0$ and ending with $t_i > t_{i-1}$, where
$t_0 = 0$ and $t_{s+1} = +\infty$. As before, these periods of time will be open on the left and
closed on the right. With these periods of time we shall associate s + 2 random
variables.

The random variable Z is defined as the number of complete periods of time
survived by the individual I. Thus if Z = 0, then the individual I meets with a fatal
accident in $(0, t_1)$, etc. If Z = s + 1 then the individual I survives up to $t_{s+1} = +\infty$.
Obviously Z is capable of assuming integer values from zero to s + 1.

With each interval $(t_{i-1}, t_i)$, where i = 1, 2, ..., s + 1, we associate a random
variable $X_i$ defined as the number of accidents incurred after the moment $t_{i-1}$ and up to
and including $t_i$, which the individual I will survive.

The variables $X_i$ and Z are interdependent. Denote by k the value assumed by Z.
If $i \le k$ then $X_i$ equals the number of accidents incurred by I in $(t_{i-1}, t_i)$, all of which
are survived. If $i = k + 1$, then $X_i$ equals one less than the number of accidents
incurred by I in $(t_k, t_{k+1})$, the last of these accidents being necessarily fatal. Finally, if
$i > k + 1$, then $X_i = 0$.

Our problem is to deduce the joint probability generating function of the variables
$Z, X_1, X_2, \ldots, X_{s+1}$.

Whatever be the random variables $Y_1, Y_2, \ldots, Y_r$ capable of assuming
nonnegative integer values and whatever be the hypotheses H, we shall use the generic
symbol

(11)  $G_{Y_1, \ldots, Y_r}(u_1, u_2, \ldots, u_r \mid H) = \sum_{k_1=0}^{\infty} \sum_{k_2=0}^{\infty} \cdots \sum_{k_r=0}^{\infty} u_1^{k_1} u_2^{k_2} \cdots u_r^{k_r}\, P\{(Y_1 = k_1)(Y_2 = k_2) \cdots (Y_r = k_r) \mid H\}$

to denote the conditional probability generating function of $Y_1, Y_2, \ldots, Y_r$, given
the hypothesis H. Here the argument $u_i$ corresponds to the variable $Y_i$ and it is
assumed that $|u_i| \le 1$, for i = 1, 2, ..., r. For the variables $Z, X_1, \ldots, X_{s+1}$, the
argument of the probability generating function which corresponds to Z will be
denoted by v and the argument corresponding to $X_i$ by $u_i$, i = 1, 2, ..., s + 1. With
this notation, the object of our study is the generating function

(12)  $G_{Z, X_1, \ldots, X_{s+1}}(v, u_1, u_2, \ldots, u_{s+1} \mid P)$

implied by model (P), its particular case

(13)  $G_{Z, X_1, \ldots, X_{s+1}}(v, u_1, u_2, \ldots, u_{s+1} \mid [\mu = \nu = 0]\,P)$

and the counterpart of (12) implied by the mixture-no contagion-no time-effect model
(N). Obviously (13) is a function of $\lambda$ and we have

(14)  $G_{Z, X_1, \ldots, X_{s+1}}(v, u_1, u_2, \ldots, u_{s+1} \mid N) = \int_0^{\infty} G_{Z, X_1, \ldots, X_{s+1}}(v, u_1, \ldots, u_{s+1} \mid [\mu = \nu = 0]\,P)\, p_\Lambda(\lambda)\, d\lambda$ .

In the following we shall have occasion to use the fundamental relation between
the absolute and the conditional expectation, familiar for a long time, but first
rigorously established by Kolmogoroff [6]. Let $Y_1, Y_2, \ldots, Y_r$ be any random
variables and let $f(y_1, y_2, \ldots, y_r)$ be any Borel measurable function of real
arguments $y_1, y_2, \ldots, y_r$. Then

(15)  $E[f(Y_1, Y_2, \ldots, Y_r)] = E\{E[f(Y_1, Y_2, \ldots, Y_r) \mid Y_1, Y_2, \ldots, Y_{r-1}]\}$ .
4. Preliminary formulae. Applying (15) we may write

(16)  $G_{Z, X_1, \ldots, X_{s+1}}(v, u_1, u_2, \ldots, u_{s+1}) = E\Big[v^Z \prod_{i=1}^{s+1} u_i^{X_i}\Big] = E\Big[v^Z\, E\Big(\prod_{i=1}^{s+1} u_i^{X_i} \,\Big|\, Z\Big)\Big] = \sum_{m=0}^{s+1} v^m\, G_{X_1, \ldots, X_{s+1}}(u_1, \ldots, u_{s+1} \mid Z = m)\, P\{Z = m\}$ .

Since Z = m implies $X_i = 0$ for i > m + 1, the conditional probability generating
function on the right of all the s + 1 variables $X_1, X_2, \ldots, X_{s+1}$ reduces to

(17)  $G_{X_1, \ldots, X_{m+1}}(u_1, \ldots, u_{m+1} \mid Z = m)$ .

Our first step, then, will be to provide means for computing the probabilities
$P\{Z = m\}$ and the conditional probability generating functions (17). For this
purpose we return to the probabilities $P_{m,n}(T_1, T_2)$ and $Q_{m,n}(T_1, T_2)$ introduced in section
2. Multiplying them by $u^n$ and summing for n from zero to infinity, we get, say

(18)  $g_m(T_1, T_2, u) = \sum_{n=0}^{\infty} u^n P_{m,n}(T_1, T_2)$

and

(19)  $h_m(T_1, T_2, u) = \sum_{n=0}^{\infty} u^n Q_{m,n}(T_1, T_2)$ .

For $|u| \le 1$ both series converge and determine $g_m$ and $h_m$ as functions of u which
are differentiable for $|u| < 1$. In many instances below the value of u will be
immaterial and in these cases, to simplify the notation, we shall omit u from the symbol of
the two functions. Also, whenever there is no danger of misunderstanding, we shall
occasionally omit all three arguments and write simply $g_m$ and $h_m$ for the left-hand
sides of (18) and (19).

In order to determine the functions g and h we proceed in the familiar manner
[4 and 5] and write down the relation between $P_{m,n}(T_1, T_2)$ and $P_{m,n}(T_1, T_2 + \tau)$
where $\tau > 0$. We have

(20)  $P_{m,0}(T_1, T_2 + \tau) = P_{m,0}(T_1, T_2)\, P_{m,0}(T_2, T_2 + \tau)$

and

(21)  $P_{m,n}(T_1, T_2 + \tau) = P_{m,n}(T_1, T_2)\, P_{m+n,0}(T_2, T_2 + \tau) + P_{m,n-1}(T_1, T_2)\, P_{m+n-1,1}(T_2, T_2 + \tau) + o(\tau)$

for n > 0, where, owing to (7), $o(\tau)$ decreases when $\tau \to 0$ and the rate of decrease is
faster than that of $\tau$. Subtracting $P_{m,0}(T_1, T_2)$ from both sides of (20) and $P_{m,n}(T_1, T_2)$
from both sides of (21), dividing the results by $\tau$, passing to the limit as $\tau \to 0$, and
using (6) and (9) we obtain

(22)  $\dfrac{\partial P_{m,0}(T_1, T_2)}{\partial T_2} = -\lambda\,\dfrac{1 + \mu m}{1 + \nu T_2}\, P_{m,0}(T_1, T_2)$

and

(23)  $\dfrac{\partial P_{m,n}(T_1, T_2)}{\partial T_2} = -\lambda\,\dfrac{1 + \mu m + \mu n}{1 + \nu T_2}\, P_{m,n}(T_1, T_2) + \theta\lambda\,\dfrac{1 + \mu m + \mu (n - 1)}{1 + \nu T_2}\, P_{m,n-1}(T_1, T_2)$ .

Now we multiply (23) by $u^n$, sum for n from unity to $+\infty$, add to (22), use (18)
and obtain, after some easy algebra,

(24)  $(1 + \nu T_2)\,\dfrac{\partial g_m}{\partial T_2} + \lambda\mu u (1 - \theta u)\,\dfrac{\partial g_m}{\partial u} = -\lambda (1 + \mu m)(1 - \theta u)\, g_m$ .

Using the familiar methods, the general solution of this partial differential
equation is easily found to be

(25)  $g_m = u^{-(\gamma + m)}\, f\!\left[\dfrac{1 - \theta u}{u}\, A(T_2)\right]$

where, to simplify the formula, $\gamma = 1/\mu$ and, generally,

(26)  $A(T) = (1 + \nu T)^{\lambda\mu/\nu}$ .

Here f(x) stands for an arbitrary differentiable function of the argument x. This
function must be so selected that (25) coincide with $g_m(T_1, T_2)$. For this purpose we
notice that the substitution $T_2 = T_1$ gives $g_m(T_1, T_1) \equiv 1$ identically in u and $T_1$.
Making this substitution in (25) and equating the result to unity we obtain the
condition determining the function f(x),

(27)  $f\!\left[\dfrac{1 - \theta u}{u}\, A(T_1)\right] = u^{\gamma + m}$ .
Now substitute

(28)  $\dfrac{1 - \theta u}{u}\, A(T_1) = x$

and solve for u

(29)  $u = \dfrac{A(T_1)}{x + \theta A(T_1)}$ .

Substituting (29) into (27) we have

(30)  $f(x) = \left(\dfrac{A(T_1)}{x + \theta A(T_1)}\right)^{\gamma + m}$ .
Now the function f(x) is determined. In order to obtain $g_m$ we substitute (30)
into (25). Easy algebra gives

(31)  $g_m(T_1, T_2, u) = \left(\dfrac{A(T_1)}{\theta A(T_1) + (1 - \theta) A(T_2) + \theta [A(T_2) - A(T_1)](1 - u)}\right)^{\gamma + m} = [D(T_1, T_2, u)]^{\gamma + m}$ ,  say.
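
As a numerical cross-check of the step from (24) to (31), the sketch below evaluates (31) and verifies by central finite differences that it satisfies the partial differential equation (24) together with the boundary condition $g_m(T_1, T_1, u) = 1$. It is only a sketch under stated assumptions: the function names, the step size h and all parameter values are illustrative choices, and $\nu > 0$ and $\mu > 0$ are assumed so that (26) and $\gamma = 1/\mu$ are defined.

```python
def A(T, lam, mu, nu):
    """A(T) = (1 + nu*T)**(lam*mu/nu), formula (26); requires nu > 0."""
    return (1.0 + nu * T) ** (lam * mu / nu)


def g(m, T1, T2, u, lam, mu, nu, theta):
    """Generating function g_m(T1, T2, u) = D**(gamma + m), formula (31)."""
    a1, a2 = A(T1, lam, mu, nu), A(T2, lam, mu, nu)
    D = a1 / (theta * a1 + (1.0 - theta) * a2 + theta * (a2 - a1) * (1.0 - u))
    return D ** (1.0 / mu + m)


def pde_residual(m, T1, T2, u, lam, mu, nu, theta, h=1e-5):
    """Left side minus right side of (24), derivatives by central differences."""
    dg_dT2 = (g(m, T1, T2 + h, u, lam, mu, nu, theta)
              - g(m, T1, T2 - h, u, lam, mu, nu, theta)) / (2.0 * h)
    dg_du = (g(m, T1, T2, u + h, lam, mu, nu, theta)
             - g(m, T1, T2, u - h, lam, mu, nu, theta)) / (2.0 * h)
    value = g(m, T1, T2, u, lam, mu, nu, theta)
    lhs = (1.0 + nu * T2) * dg_dT2 + lam * mu * u * (1.0 - theta * u) * dg_du
    rhs = -lam * (1.0 + mu * m) * (1.0 - theta * u) * value
    return lhs - rhs


params = dict(lam=0.8, mu=0.5, nu=0.3, theta=0.9)   # assumed values
residual = pde_residual(m=2, T1=1.0, T2=2.5, u=0.6, **params)
boundary = g(2, 1.0, 1.0, 0.6, **params)            # should equal 1 at T2 = T1
```

A residual close to zero at a few arbitrary points is of course not a proof, but it guards against sign and index slips when the formulae are transcribed.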

Now we turn to the function $h_m(T_1, T_2, u)$ generating the probabilities $Q_{m,n}(T_1, T_2)$.
Using the same method we write

(32)  $Q_{m,n}(T_1, T_2 + \tau) = Q_{m,n}(T_1, T_2) + P_{m,n}(T_1, T_2)\, Q_{m+n,0}(T_2, T_2 + \tau) + o(\tau)$

and it follows

(33)  $\dfrac{\partial Q_{m,n}(T_1, T_2)}{\partial T_2} = (1 - \theta)\,\lambda\,\dfrac{1 + \mu m + \mu n}{1 + \nu T_2}\, P_{m,n}(T_1, T_2)$ .

Multiplying this result by $u^n$, summing for n from zero to infinity and using (18)
and (19) we obtain

(34)  $\dfrac{\partial h_m}{\partial T_2} = (1 - \theta)\,\lambda \left[\dfrac{1 + \mu m}{1 + \nu T_2}\, g_m + \dfrac{\mu u}{1 + \nu T_2}\,\dfrac{\partial g_m}{\partial u}\right]$ .

The explicit expression of the derivative of $h_m$ in (34) is obtained using (31). Since
at $T_2 = T_1$ the value of $h_m$ must be zero identically in u, an easy integration gives

(35)  $h_m(T_1, T_2, u) = \dfrac{1 - \theta}{1 - \theta u}\,[1 - g_m(T_1, T_2, u)]$

where $g_m$ is given by (31).

Formulae  (31)  and  (35)  play  a  basic  role  in  our  further  study.  We  begin  by  using 
(31)  to  evaluate  the  frequency  function  of  the  random  variable  Z. 

5. Probability of surviving exactly j complete periods of observation. Referring
to the definition of the function $g_m(T_1, T_2, u)$ and of the probabilities $P_{m,n}(T_1, T_2)$, it
is easy to see that $g_m(T_1, T_2, 1)$ represents the conditional probability that the
individual I will survive at least up to and including $T_2$, given that he was alive at $T_1$
and that up to the moment $T_1$ he sustained exactly m accidents. In particular, we
obtain from (31)

(36)  $g_0(0, T, 1) = [\theta + A(T)(1 - \theta)]^{-\gamma}$

for the probability that the individual I, alive at t = 0, will survive at least up to and
including an arbitrary moment $T \ge 0$.

Now return to the random variable Z defined as the exact number of complete
time intervals $(t_{i-1}, t_i)$ which the individual I, alive at time zero, will survive.
Whatever the nonnegative integer j, it is obvious that

(37)  $P\{Z \ge j\} = [\theta + A_j (1 - \theta)]^{-\gamma}$

where, to simplify the notation, $A_j = A(t_j)$. It will be noticed that the conventional
definition of $t_{s+1} = +\infty$ implies $A_{s+1} = +\infty$ and, therefore, $P\{Z \ge s + 1\} = 0$. Now,
the probability that I will survive exactly j complete intervals $(t_{i-1}, t_i)$ is

(38)  $P\{Z = j\} = P\{Z \ge j\} - P\{Z \ge j + 1\} = [\theta + A_j (1 - \theta)]^{-\gamma} - [\theta + A_{j+1} (1 - \theta)]^{-\gamma}$

for j = 0, 1, 2, ..., s; while

(39)  $P\{Z = s + 1\} = 0$ .
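
Formulae (36) through (39) are straightforward to tabulate. The following sketch computes the distribution of Z for a given set of period endpoints; the function names and the numerical values are illustrative assumptions. It also assumes $\lambda$, $\mu$, $\nu > 0$ and $\theta < 1$, so that the survival probability at $t_{s+1} = +\infty$ is zero.

```python
import math


def survival_probability(t, lam, mu, nu, theta):
    """P{alive at time t} = [theta + A(t)(1 - theta)]**(-1/mu), formula (36).

    Assumes lam, mu, nu > 0 and theta < 1, so that the value at t = +inf is 0.
    """
    if math.isinf(t):
        return 0.0
    a = (1.0 + nu * t) ** (lam * mu / nu)        # A(t), formula (26)
    return (theta + a * (1.0 - theta)) ** (-1.0 / mu)


def z_distribution(endpoints, lam, mu, nu, theta):
    """Return [P{Z = 0}, ..., P{Z = s}] from (38) for endpoints t_1 < ... < t_s."""
    t = [0.0] + list(endpoints) + [math.inf]     # t_0 = 0, t_{s+1} = +inf
    surv = [survival_probability(x, lam, mu, nu, theta) for x in t]
    return [surv[j] - surv[j + 1] for j in range(len(t) - 1)]


# Assumed illustrative values: three observed periods ending at t = 1, 2, 3.
probs = z_distribution([1.0, 2.0, 3.0], lam=0.8, mu=0.5, nu=0.3, theta=0.9)
```

Because the differences in (38) telescope, the returned probabilities sum to $P\{Z \ge 0\} - P\{Z \ge s + 1\} = 1$, in agreement with (39).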


6. Probability generating function of $Z, X_1, X_2, \ldots, X_{s+1}$ implied by model (P).

In order to deduce the expression for the probability generating function desired we
first establish a convenient recurrence formula.

Let $n_1, n_2, \ldots, n_j$ be any nonnegative integers. Define $S_0 = 0$ and generally

(40)  $S_i = \sum_{k=1}^{i} n_k$ .

Then the product

(41)  $\prod_{i=1}^{j} P_{S_{i-1}, n_i}(t_{i-1}, t_i)$

represents the probability that the individual I, alive at time zero, will survive at
least up to and including $t_j$, and that in the interval $(t_{i-1}, t_i)$, with i = 1, 2, 3, ..., j,
he will survive exactly $n_i$ accidents. It follows that, by dividing (41) by $P\{Z \ge j\}$ we
shall obtain the conditional probability of the compound event

(42)  $(X_1 = n_1)(X_2 = n_2) \cdots (X_j = n_j)$

given that the individual I survives up to and including $t_j$. Thus

(43)  $G_{X_1, \ldots, X_j}(u_1, u_2, \ldots, u_j \mid Z \ge j) = \dfrac{1}{P\{Z \ge j\}} \sum \prod_{i=1}^{j} u_i^{n_i}\, P_{S_{i-1}, n_i}(t_{i-1}, t_i)$

where $\sum$ symbolizes j-fold summation for $n_1, n_2, \ldots, n_j$, each from zero to infinity.
Referring to (18) and (31) we see that the last of these summations gives

(44)  $\sum_{n_j=0}^{\infty} u_j^{n_j}\, P_{S_{j-1}, n_j}(t_{j-1}, t_j) = \left(\dfrac{A_{j-1}}{\theta A_{j-1} + (1 - \theta) A_j + \theta (A_j - A_{j-1})(1 - u_j)}\right)^{\gamma + S_{j-1}} = [D(t_{j-1}, t_j, u_j)]^{\gamma + S_{j-1}}$ .
We have then, in particular,

(45)  $G_{X_1}(u_1 \mid Z \ge 1) = \dfrac{g_0(0, t_1, u_1)}{P\{Z \ge 1\}} = \dfrac{g_0(0, t_1, u_1)}{g_0(0, t_1, 1)} = \left[1 + \dfrac{\theta (A_1 - A_0)(1 - u_1)}{\theta + (1 - \theta) A_1}\right]^{-\gamma}$ .

Returning now to (43), if we multiply both sides of this equation by $P\{Z \ge j\}$
and use (44), we obtain

(46)  $P\{Z \ge j\}\, G_{X_1, \ldots, X_j}(u_1, u_2, \ldots, u_j \mid Z \ge j) = [D(t_{j-1}, t_j, u_j)]^{\gamma} \left\{\sum \prod_{i=1}^{j-1} (u_i D)^{n_i}\, P_{S_{i-1}, n_i}(t_{i-1}, t_i)\right\}$

where, for short, $D = D(t_{j-1}, t_j, u_j)$ and the summation extends over all combinations
of values of $n_1, n_2, \ldots, n_{j-1}$ from zero to infinity.

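It is easily seen from (43) that the expression in curved brackets in (46) is equal to

(47)  $P\{Z \ge j - 1\}\, G_{X_1, \ldots, X_{j-1}}(u_1 D, u_2 D, \ldots, u_{j-1} D \mid Z \ge j - 1)$ .

This establishes the recurrence formula sought, namely

(48)  $G_{X_1, \ldots, X_j}(u_1, u_2, \ldots, u_j \mid Z \ge j) = \dfrac{P\{Z \ge j - 1\}}{P\{Z \ge j\}}\, D^{\gamma}\, G_{X_1, \ldots, X_{j-1}}(u_1 D, \ldots, u_{j-1} D \mid Z \ge j - 1)$ .

Using this formula and (37), (45) we easily obtain

(49)  $G_{X_1, X_2}(u_1, u_2 \mid Z \ge 2) = \left\{1 + \dfrac{\theta}{\theta + (1 - \theta) A_2}\,\big[(A_1 - A_0)(1 - u_1) + (A_2 - A_1)(1 - u_2)\big]\right\}^{-\gamma}$

and generally, by induction

(50)  $G_{X_1, \ldots, X_j}(u_1, u_2, \ldots, u_j \mid Z \ge j) = \left\{1 + \dfrac{\theta}{\theta + (1 - \theta) A_j} \sum_{i=1}^{j} (A_i - A_{i-1})(1 - u_i)\right\}^{-\gamma}$ .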
It is seen that the conditional distribution of $X_1, X_2, \ldots, X_j$, given that $Z \ge j$, is
always a j-variate negative binomial. We propose to call it the generalized Pólya
multivariate distribution.
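
For j = 1 formula (50) has the form $[1 + c(1 - u)]^{-\gamma}$ with $c = \theta (A_1 - A_0)/[\theta + (1 - \theta) A_1]$, which is the generating function of a negative binomial law with probabilities $\Gamma(\gamma + n)/[\Gamma(\gamma)\, n!]\; q^n (1 - q)^{\gamma}$, where $q = c/(1 + c)$. The sketch below checks this identification numerically by summing the series. The function names, the truncation point and the parameter values are assumptions made for illustration only.

```python
import math


def pgf_closed_form(u, c, gamma):
    """[1 + c(1 - u)]**(-gamma): the form taken by (50) when j = 1."""
    return (1.0 + c * (1.0 - u)) ** (-gamma)


def negative_binomial_pmf(n, c, gamma):
    """P{X = n} for the negative binomial law with the PGF above."""
    q = c / (1.0 + c)
    log_p = (math.lgamma(gamma + n) - math.lgamma(gamma) - math.lgamma(n + 1)
             + n * math.log(q) + gamma * math.log(1.0 - q))
    return math.exp(log_p)


def pgf_from_pmf(u, c, gamma, terms=400):
    """Truncated series sum_n u**n P{X = n}."""
    return sum(u ** n * negative_binomial_pmf(n, c, gamma) for n in range(terms))


# Assumed illustrative values: theta = 0.9, A_0 = 1, A_1 = 1.4, gamma = 1/mu = 2.
theta, A0, A1, gamma = 0.9, 1.0, 1.4, 2.0
c = theta * (A1 - A0) / (theta + (1.0 - theta) * A1)
closed = pgf_closed_form(0.6, c, gamma)
series = pgf_from_pmf(0.6, c, gamma)
total = pgf_from_pmf(1.0, c, gamma)   # probabilities should sum to 1
```

The truncation at 400 terms is harmless here because q is well below unity; for values of c much larger than those assumed above more terms would be needed.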

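Now we can use the same method to compute the conditional distribution of
$X_1, X_2, \ldots, X_{j+1}$ given that Z = j. To do so we turn our attention to (41) and notice
that, if this product is multiplied by $Q_{S_j, n_{j+1}}(t_j, t_{j+1})$ then the result will equal the
probability that the individual I, alive at zero time, will sustain and survive exactly $n_i$
accidents in $(t_{i-1}, t_i)$ for i = 1, 2, ..., j + 1 and that he will perish at the $(n_{j+1} + 1)$st
accident between $t_j$ and $t_{j+1}$. It follows that

(51)  $P\{Z = j\}\, G_{X_1, \ldots, X_{j+1}}(u_1, u_2, \ldots, u_{j+1} \mid Z = j) = \sum \prod_{i=1}^{j} u_i^{n_i}\, P_{S_{i-1}, n_i}(t_{i-1}, t_i) \sum_{n_{j+1}=0}^{\infty} u_{j+1}^{n_{j+1}}\, Q_{S_j, n_{j+1}}(t_j, t_{j+1})$

where the first sum extends over all values of $n_1, n_2, \ldots, n_j$ from zero to infinity.
However, referring to (19) and (35), we see that the last sum coincides with

(52)  $h_{S_j}(t_j, t_{j+1}, u_{j+1}) = \dfrac{1 - \theta}{1 - \theta u_{j+1}}\,[1 - g_{S_j}(t_j, t_{j+1}, u_{j+1})]$ .

Substituting this result into (51), we have

(53)  $P\{Z = j\}\, G_{X_1, \ldots, X_{j+1}}(u_1, u_2, \ldots, u_{j+1} \mid Z = j) = \dfrac{1 - \theta}{1 - \theta u_{j+1}} \sum \prod_{i=1}^{j} u_i^{n_i}\, P_{S_{i-1}, n_i}(t_{i-1}, t_i)\,[1 - g_{S_j}(t_j, t_{j+1}, u_{j+1})]$ .

Referring again to (43) and (44), we obtain easily

(54)  $P\{Z = j\}\, G_{X_1, \ldots, X_{j+1}}(u_1, u_2, \ldots, u_{j+1} \mid Z = j) = \dfrac{1 - \theta}{1 - \theta u_{j+1}} \Big[P\{Z \ge j\}\, G_{X_1, \ldots, X_j}(u_1, u_2, \ldots, u_j \mid Z \ge j) - P\{Z \ge j + 1\}\, G_{X_1, \ldots, X_{j+1}}(u_1, \ldots, u_{j+1} \mid Z \ge j + 1)\Big]$

which determines the generating function for the conditional distribution of
$X_1, X_2, \ldots, X_{j+1}$ given that Z = j, for j = 1, 2, ..., s.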
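Substituting the explicit expressions (50) for the generating functions on the right
of (54) and using (37) and (38) we obtain

(55)  $P\{Z = j\}\, G_{X_1, \ldots, X_{j+1}}(u_1, u_2, \ldots, u_{j+1} \mid Z = j) = \dfrac{1 - \theta}{1 - \theta u_{j+1}} \left\{\Big[\theta + (1 - \theta) A_j + \theta \sum_{i=0}^{j} (A_i - A_{i-1})(1 - u_i)\Big]^{-\gamma} - \Big[\theta + (1 - \theta) A_{j+1} + \theta \sum_{i=0}^{j+1} (A_i - A_{i-1})(1 - u_i)\Big]^{-\gamma}\right\}$

where, with the convention $u_0 \equiv 1$, this formula is valid for j = 0, 1, 2, ..., s.
Finally, we have

(56)  $G_{Z, X_1, X_2, \ldots, X_{s+1}}(v, u_1, u_2, \ldots, u_{s+1}) = \sum_{j=0}^{s} v^j\, \dfrac{1 - \theta}{1 - \theta u_{j+1}} \left\{\Big[\theta + (1 - \theta) A_j + \theta \sum_{i=0}^{j} (A_i - A_{i-1})(1 - u_i)\Big]^{-\gamma} - \Big[\theta + (1 - \theta) A_{j+1} + \theta \sum_{i=0}^{j+1} (A_i - A_{i-1})(1 - u_i)\Big]^{-\gamma}\right\}$ .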


7. Probability generating function of $Z, X_1, X_2, \ldots, X_{s+1}$ implied by model (N).
In the present section we use the results obtained to compute (14). For this purpose
it will be sufficient to evaluate the limit

(57)  $\lim_{\mu, \nu \to 0} P\{Z = j\}\, G_{X_1, \ldots, X_{j+1}}(u_1, u_2, \ldots, u_{j+1} \mid Z = j) = F_j(\lambda, u_1, u_2, \ldots, u_{j+1})$ ,  say,

and to perform the integration

(58)  $\int_0^{\infty} F_j(\lambda, u_1, u_2, \ldots, u_{j+1})\, p_\Lambda(\lambda)\, d\lambda = F_j^*(u_1, u_2, \ldots, u_{j+1})$ ,  say.

Then

(59)  $G_{Z, X_1, X_2, \ldots, X_{s+1}}(v, u_1, u_2, \ldots, u_{s+1} \mid N) = \sum_{j=0}^{s} v^j\, F_j^*(u_1, u_2, \ldots, u_{j+1})$ .

To evaluate the limit in (57) we consider first the expression

(60)  $\Big[\theta + (1 - \theta) A_j + \theta \sum_{i=0}^{j} (A_i - A_{i-1})(1 - u_i)\Big]^{-\gamma} = B_j$ ,  say.
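We have, recalling the definition of $A_j$ in (26),

(61)  $\lim_{\nu \to 0} B_j = \Big[\theta + (1 - \theta)\, e^{\lambda\mu t_j} + \theta \sum_{i=0}^{j} \big(e^{\lambda\mu t_i} - e^{\lambda\mu t_{i-1}}\big)(1 - u_i)\Big]^{-1/\mu}$ ,

(62)  $\lim_{\mu \to 0} \lim_{\nu \to 0} B_j = \exp\Big\{-\lambda \Big[(1 - \theta)\, t_j + \theta \sum_{i=0}^{j} (t_i - t_{i-1})(1 - u_i)\Big]\Big\}$ .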
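
The passage to the limit in (61) and (62) can be checked numerically. The sketch below compares $B_j$ of (60), evaluated for small positive $\mu$ and $\nu$, with the exponential in (62). It is a sketch under assumptions: the function names and numerical values are illustrative, the two parameters are sent to zero together along $\mu = \nu$ rather than one after the other, and the i = 0 term of the sums is dropped because it vanishes under the convention $u_0 = 1$.

```python
import math


def B(t, u, lam, mu, nu, theta):
    """B_j of (60); t = [t_0, ..., t_j] with t_0 = 0, u = [u_0, ..., u_j] with u_0 = 1."""
    a = [(1.0 + nu * x) ** (lam * mu / nu) for x in t]          # A_i, formula (26)
    s = sum((a[i] - a[i - 1]) * (1.0 - u[i]) for i in range(1, len(t)))
    return (theta + (1.0 - theta) * a[-1] + theta * s) ** (-1.0 / mu)


def B_limit(t, u, lam, theta):
    """Right-hand side of (62)."""
    s = sum((t[i] - t[i - 1]) * (1.0 - u[i]) for i in range(1, len(t)))
    return math.exp(-lam * ((1.0 - theta) * t[-1] + theta * s))


# Assumed illustrative values with j = 2.
t = [0.0, 1.0, 2.5]
u = [1.0, 0.3, 0.7]
lam, theta = 0.8, 0.9

limit = B_limit(t, u, lam, theta)
b_coarse = B(t, u, lam, mu=0.1, nu=0.1, theta=theta)
b_small = B(t, u, lam, mu=1e-6, nu=1e-6, theta=theta)
```

With the values assumed here the discrepancy shrinks roughly in proportion to the common value of $\mu$ and $\nu$.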


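It follows, then, from (56) that

(63)  $F_j(\lambda, u_1, u_2, \ldots, u_{j+1}) = \dfrac{1 - \theta}{1 - \theta u_{j+1}} \left\{\exp\Big[-\lambda \Big((1 - \theta)\, t_j + \theta \sum_{i=0}^{j} (t_i - t_{i-1})(1 - u_i)\Big)\Big] - \exp\Big[-\lambda \Big((1 - \theta)\, t_{j+1} + \theta \sum_{i=0}^{j+1} (t_i - t_{i-1})(1 - u_i)\Big)\Big]\right\}$ .

Easy integration gives

(64)  $F_j^*(u_1, u_2, \ldots, u_{j+1}) = \dfrac{1 - \theta}{1 - \theta u_{j+1}} \left\{\Big[1 + (1 - \theta)\beta^{-1} t_j + \beta^{-1}\theta \sum_{i=0}^{j} (t_i - t_{i-1})(1 - u_i)\Big]^{-\alpha} - \Big[1 + (1 - \theta)\beta^{-1} t_{j+1} + \beta^{-1}\theta \sum_{i=0}^{j+1} (t_i - t_{i-1})(1 - u_i)\Big]^{-\alpha}\right\}$ .

Substituting this expression for $F_j^*$ in the right-hand side of (59) we get the desired
generating function.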
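
The "easy integration" rests on the identity $\int_0^{\infty} e^{-c\lambda}\, p_\Lambda(\lambda)\, d\lambda = (1 + c/\beta)^{-\alpha}$ for the density (10), applied to each exponential in (63). The sketch below confirms the identity numerically with a plain midpoint rule. The function names, the integration range, the number of steps and the parameter values are assumptions chosen for illustration.

```python
import math


def gamma_density(lam, alpha, beta):
    """p_Lambda(lam) of (10): beta**alpha / Gamma(alpha) * lam**(alpha-1) * exp(-beta*lam)."""
    return math.exp(alpha * math.log(beta) - math.lgamma(alpha)
                    + (alpha - 1.0) * math.log(lam) - beta * lam)


def mixture_numeric(c, alpha, beta, upper=60.0, steps=200_000):
    """Midpoint-rule approximation to the integral of exp(-c*lam) * p_Lambda(lam)."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        lam = (k + 0.5) * h
        total += math.exp(-c * lam) * gamma_density(lam, alpha, beta)
    return total * h


def mixture_closed_form(c, alpha, beta):
    """(1 + c/beta)**(-alpha), the form of each bracket in (64)."""
    return (1.0 + c / beta) ** (-alpha)


# Assumed illustrative values.
alpha, beta, c = 2.5, 1.5, 0.7
numeric = mixture_numeric(c, alpha, beta)
closed = mixture_closed_form(c, alpha, beta)
```

In (64) the constant c stands for $(1 - \theta)\, t_j + \theta \sum (t_i - t_{i-1})(1 - u_i)$, so that $c/\beta$ reproduces the terms carrying $\beta^{-1}$.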

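It will be noticed from (59) that substituting into $F_j^*$ unity for each of its
arguments, we obtain the probability that Z = j as implied by model (N). Thus

(65)  $P\{Z = j \mid N\} = [1 + (1 - \theta)\beta^{-1} t_j]^{-\alpha} - [1 + (1 - \theta)\beta^{-1} t_{j+1}]^{-\alpha}$

and with the convention $t_{s+1} = +\infty$, it follows that

(66)  $P\{Z \ge j \mid N\} = [1 + (1 - \theta)\beta^{-1} t_j]^{-\alpha}$ .

Dividing (64) by (65) we obtain a formula determining the conditional
probability generating function of $X_1, X_2, \ldots, X_{j+1}$, given that in the interval $(t_j, t_{j+1})$
the individual I meets with a fatal accident,

(67)  $G_{X_1, \ldots, X_{j+1}}(u_1, \ldots, u_{j+1} \mid (Z = j), N) = \dfrac{1 - \theta}{1 - \theta u_{j+1}} \cdot \dfrac{\Big[1 + (1 - \theta)\beta^{-1} t_j + \beta^{-1}\theta \sum_{i=0}^{j} (t_i - t_{i-1})(1 - u_i)\Big]^{-\alpha} - \Big[1 + (1 - \theta)\beta^{-1} t_{j+1} + \beta^{-1}\theta \sum_{i=0}^{j+1} (t_i - t_{i-1})(1 - u_i)\Big]^{-\alpha}}{[1 + (1 - \theta)\beta^{-1} t_j]^{-\alpha} - [1 + (1 - \theta)\beta^{-1} t_{j+1}]^{-\alpha}}$

where again we adopt the convention $u_0 = 1$ so that (67) is valid for j = 1, 2, ..., s.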

When comparing the models (N) and (P), formula (67) should be compared with
(55). Both generate probabilities of the various combinations of values of the $X_1$,
$X_2, \ldots, X_j$ and $X_{j+1}$ subject to the restriction that in $(t_j, t_{j+1})$ the individual I meets
with a fatal accident. In order to obtain for the model (N) the counterpart of
formula (50) we notice that, with $u_{j+1} = \cdots = u_{k+1} = 1$,

(68)  $\sum_{k=j}^{s} F_k^*(u_1, \ldots, u_{k+1}) = P\{Z \ge j \mid N\}\, G_{X_1, \ldots, X_j}(u_1, u_2, \ldots, u_j \mid (Z \ge j), N)$ .

Upon dividing by (66), we obtain

(69)  $G_{X_1, \ldots, X_j}(u_1, u_2, \ldots, u_j \mid (Z \ge j), N) = \left\{1 + \dfrac{\beta^{-1}\theta \sum_{i=1}^{j} (t_i - t_{i-1})(1 - u_i)}{1 + (1 - \theta)\beta^{-1} t_j}\right\}^{-\alpha}$ .

8. Comparison between the distributions implied by models (P) and (N). The
comparisons between the implications of models (P) and (N) made thus far [3, 4,
and 5] refer to the distributions of $X_1$ with $\theta = 1$. Using (50) and (69), we have

(70)  $G_{X_1}(u_1 \mid (\theta = 1), P) = [1 + (A_1 - 1)(1 - u_1)]^{-\gamma}$

and

(71)  $G_{X_1}(u_1 \mid (\theta = 1), N) = [1 + \beta^{-1} t_1 (1 - u_1)]^{-\alpha}$ .

It is seen that, with $\alpha = \gamma$ and $\beta^{-1} t_1 = (A_1 - 1)$, the two distributions coincide so
that no amount of empirical data regarding $X_1$ alone can afford means of
distinguishing between the two models.

The above comparison is not entirely relevant, since it is frequently impracticable
to ascertain the number of light accidents which the individuals of a population may
have incurred prior to the period of observation. For this reason it is doubtful
whether one could ever obtain data which could serve as an empirical counterpart
of the distribution generated by (71). The most one can hope to obtain is data
regarding individuals who were exposed to unobserved accidents for approximately
the same period of time, perhaps for a long time $t_1$, and then were subjected to
observation during one or more subsequent periods $(t_1, t_2)$, $(t_2, t_3)$, etc. Such, for
example, is true of the data on the London bus drivers discussed in Part I [1].
Before being employed by the London Transport Board, these 166 persons were
experienced drivers and many of them must have had quite a few accidents which
are not in the records. However, the time $t_1$ that elapsed between the obtaining of a
driver's license and the beginning of the employment in London could probably be
established with reasonable accuracy. Then, the statistics compiled for those drivers
for whom $t_1$ has the same value could be used as a counterpart of the theoretical
distributions of the random variables $X_2, X_3, \ldots$. We will compare these
distributions for the two models (P) and (N), more generally, assuming that to each
accident there corresponds a fixed probability $\theta$ of survival.

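Consider first, then, formulas (50) and (69), with $u_1 = 1$. We have

(72)  $G_{X_2,\dots,X_j}(u_2,\dots,u_j \mid (Z \ge j), P) = \left[1 + \theta C_j \sum_{i=2}^{j}(A_i - A_{i-1})(1 - u_i)\right]^{-1/\mu}$

and

(73)  $G_{X_2,\dots,X_j}(u_2,\dots,u_j \mid (Z \ge j), N) = \left[1 + \theta\beta^{-1}C_j^{*} \sum_{i=2}^{j}(t_i - t_{i-1})(1 - u_i)\right]^{-\alpha}$

where, for simplicity,

(74)  $C_j = [\theta + (1 - \theta)A_j]^{-1}$ and $C_j^{*} = [1 + (1 - \theta)\beta^{-1}t_j]^{-1}$ .

It is seen that, if the observations are limited to one period only, e.g., from $t_1$ to $t_2$,
then the distributions implied by the two models are single-variate negative binomials
with two parameters each and are indistinguishable.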

However, if the observations refer to two or more equal consecutive intervals, say
$(t_i, t_{i+1} = t_i + 1)$ for $i = 1, 2, \dots, j - 1$, then the situation is changed considerably.
The coefficients of the binomials $(1 - u_i)$ in (73) are all equal to

(75)  $\theta\beta^{-1}C_j^{*}$ .

On the other hand, in (72) the coefficient of the binomial $(1 - u_i)$ is

(76)  $\theta C_j(A_i - A_{i-1}) = \theta C_j\left\{[1 + \nu t_1 + \nu(i - 1)]^{\lambda\mu/\nu} - [1 + \nu t_1 + \nu(i - 2)]^{\lambda\mu/\nu}\right\}$ .

In  order  that  the  two  distributions  be  forced  to  coincide  by  an  appropriate  choice 
of  the  parameters,  it  is  necessary  and  sufficient  that  (76)  be  independent  of  i,  that 
is  to  say,  that 

(77)  $\lambda\mu = \nu$ .
In other words, in comparing the joint distributions in models (P) and (N) of
accidents survived in two or more equal consecutive periods up to and including,
say, $t_j$, for those individuals who are known to be alive at $t_j$, we find that these joint
distributions coincide if and only if condition (77) is met by the parameters $\lambda$, $\mu$, $\nu$
of model (P). A comparison of formula (55) with (67), and (38) with (65) makes
clear that the same conclusion is reached when one compares the joint distributions
for those who are known to have succumbed to a fatal accident in the $k$th period,
say, with $k = 1, 2, \dots, j$, or when one compares the two distributions of the number
of complete periods survived.
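The role of condition (77) can be illustrated numerically. The Python sketch below computes the successive differences $A_i - A_{i-1}$ entering (76) for equal unit intervals, treating the exponent of $A_i$ as a single free number `exponent`. That name, the function names, and the numerical values are assumptions made for the illustration. The differences are constant in $i$ when the exponent equals unity and vary with $i$ otherwise.

```python
def coefficient_increments(nu, t1, exponent, periods):
    """Differences A_i - A_{i-1}, i = 2..periods, with A_i = (1 + nu*t1 + nu*(i-1))**exponent."""
    def a(i):
        return (1.0 + nu * t1 + nu * (i - 1)) ** exponent

    return [a(i) - a(i - 1) for i in range(2, periods + 1)]


def spread(values):
    """Range of a list of numbers; zero when all entries are equal."""
    return max(values) - min(values)


# Hypothetical values: nu = 0.3, unobserved exposure t1 = 5, six periods in all.
equal_case = coefficient_increments(nu=0.3, t1=5.0, exponent=1.0, periods=6)
unequal_case = coefficient_increments(nu=0.3, t1=5.0, exponent=2.0, periods=6)
```

With the exponent equal to unity each difference reduces to `nu`, so the coefficients in (72) can be matched by the constant coefficient of (73). With any other exponent the differences change from one period to the next and no choice of the parameters of model (N) can reproduce them.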

In principle, of course, this equality may be satisfied, and then the two schemes
(P) and (N) will be indistinguishable no matter how many variables $X_2, X_3, \dots, X_j$
we observe. However, the satisfaction of the equality (77) is most unlikely, and then
the multivariate distribution implied by the model (P) will be different from that
implied by model (N). At any rate, should the empirical distribution of $X_2, X_3, \dots,
X_j$ indicate the inequality of the coefficients of the binomials $(1 - u_i)$ then this is an
indication in favor of the model (P) rather than the model (N).

In the last section of the present paper we study the possibility of identifying the
nature of contagion ("true" or "false") using a different set of observable random
variables.

9.  Joint  distribution  of  the  number  of  light  and  the  number  of  severe  accidents. 

In  this  section  we  use  some  of  the  formulae  given  above  in  order  to  deduce  the  joint 
distributions  of  the  number  Y  of  light  accidents  incurred  in  one  period  of  time  of 
unit  length  and  the  number  X  of  severe  accidents  incurred  in  a  subsequent  period 
of  time,  also  of  unit  length,  as  implied  by  the  Greenwood-Yule-Newbold  model 
supplemented  by  the  fundamental  hypothesis  and  by  the  assumption  that  to  each 
severe accident there corresponds a fixed probability $\theta$ of surviving it. The formulae
deduced  here  are  given  without  proof  in  Part  I  of  the  present  paper. 

It  will  be  realized  that  the  process  of  determining  the  joint  distribution  of  X  and  Y 
is  exactly  similar  to  that  of  section  7. 

The hypotheses assumed imply that for a given individual in the population, with
a fixed proneness $\lambda$ to severe accidents, the variables X and Y are mutually independent
with the probability generating function of Y given by

(78)  $G_Y(v \mid \lambda) = e^{-A\lambda(1 - v)}$

where A is the modulus of frequency of the light accidents.

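As to the variable X, we shall identify it with $X_1$ of the set of random variables
discussed in section 7. Substituting u for $u_1$, 1 for $t_1$ and unity for $u_i$ with $i > 1$ in
formula (63) we have

(79)  $P\{Z = 0\}\,G_X(u \mid (Z = 0), N) = F_0(\lambda, u) = \dfrac{1 - \theta}{1 - \theta u}\left[1 - e^{-\lambda(1 - \theta u)}\right]$ .

Similarly, we have from (63)

(80)  $P\{Z \ge 1\}\,G_X(u \mid (Z \ge 1), N) = \sum_{i=1}^{s} F_i(\lambda, u, 1, \dots, 1) = e^{-\lambda(1 - \theta u)}$ .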
Multiplying (79) by (78) we obtain, say

(81)  $\Phi_0(\lambda, u, v) = \dfrac{1 - \theta}{1 - \theta u}\left\{e^{-A\lambda(1 - v)} - e^{-\lambda[1 - \theta u + A(1 - v)]}\right\}$

and similarly, multiplying (80) by (78),

(82)  $\Phi_1(\lambda, u, v) = e^{-\lambda[1 - \theta u + A(1 - v)]}$ .

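Multiplying (81) and (82) by the probability density function of $\Lambda$ and integrating
for $\lambda$ from zero to infinity we obtain, say

(83)  $\Phi_0^{*}(u, v) = \dfrac{1 - \theta}{1 - \theta u}\,\beta^{\alpha}\left\{[\beta + A(1 - v)]^{-\alpha} - [\beta + 1 - \theta + \theta(1 - u) + A(1 - v)]^{-\alpha}\right\}$

and

(84)  $\Phi_1^{*}(u, v) = \beta^{\alpha}[\beta + 1 - \theta + \theta(1 - u) + A(1 - v)]^{-\alpha}$

respectively. Now, exactly as in section 7,

(85)  $G_{X,Y}(u, v \mid (Z = 0), N) = \dfrac{1 - \theta}{1 - \theta u} \cdot \dfrac{[\beta + A(1 - v)]^{-\alpha} - [\beta + 1 - \theta + \theta(1 - u) + A(1 - v)]^{-\alpha}}{\beta^{-\alpha} - (\beta + 1 - \theta)^{-\alpha}}$

and

(86)  $G_{X,Y}(u, v \mid (Z \ge 1), N) = \left[\dfrac{\beta + 1 - \theta}{(\beta + 1 - \theta) + \theta(1 - u) + A(1 - v)}\right]^{\alpha}$ .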

Formulae  (85)  and  (86)  are  the  two  probability  generating  functions  sought.  The 
probabilities  generated  by  (85)  form  the  theoretical  counterpart  of  the  accident 
statistics  for  those  individuals  of  the  population  who  meet  with  a  fatal  accident.  The 
probabilities  generated  by  (86)  correspond  to  the  frequency  distribution  of  accidents 
compiled  for  those  who  survive  the  second  period  of  observation. 
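A numerical check of the two generating functions may be sketched as follows. The Python code below is an illustration only: the function names and parameter values are hypothetical, and it assumes, as in the Greenwood-Yule-Newbold scheme, that the proneness has a gamma density proportional to $\lambda^{\alpha-1}e^{-\beta\lambda}$. It evaluates (85) and (86) in closed form and compares (86) with a direct numerical integration over the proneness.

```python
import math


def pgf_fatal(u, v, alpha, beta, theta, a_mod):
    """Closed form (85): joint p.g.f. of (X, Y) given a fatal accident (Z = 0)."""
    lead = (1.0 - theta) / (1.0 - theta * u)
    num = (beta + a_mod * (1.0 - v)) ** (-alpha) - (
        beta + 1.0 - theta + theta * (1.0 - u) + a_mod * (1.0 - v)
    ) ** (-alpha)
    den = beta ** (-alpha) - (beta + 1.0 - theta) ** (-alpha)
    return lead * num / den


def pgf_survive(u, v, alpha, beta, theta, a_mod):
    """Closed form (86): joint p.g.f. of (X, Y) given survival (Z >= 1)."""
    base = beta + 1.0 - theta
    return (base / (base + theta * (1.0 - u) + a_mod * (1.0 - v))) ** alpha


def pgf_survive_numeric(u, v, alpha, beta, theta, a_mod, upper=80.0, steps=80000):
    """Midpoint-rule integration of exp(-lam*[1 - theta*u + A(1 - v)]) against the
    assumed gamma density, divided by the probability of survival."""
    h = upper / steps
    norm = beta ** alpha / math.gamma(alpha)
    rate_joint = 1.0 - theta * u + a_mod * (1.0 - v)
    rate_alive = 1.0 - theta
    joint = 0.0
    alive = 0.0
    for k in range(steps):
        lam = (k + 0.5) * h
        dens = norm * lam ** (alpha - 1.0) * math.exp(-beta * lam)
        joint += dens * math.exp(-lam * rate_joint)
        alive += dens * math.exp(-lam * rate_alive)
    return joint / alive


# Hypothetical parameter values: alpha, beta, theta, modulus A.
PARAMS = (2.0, 1.5, 0.9, 3.0)
```

Both closed forms equal unity at u = v = 1, as any probability generating function must, and the numerical integral agrees with (86) to within the accuracy of the quadrature.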

10. More hopeful approach to the problem of distinguishing between models (P)
and (N). As was shown in section 8, the joint distribution of the numbers of accidents
survived in several consecutive periods of observation implied by model (P)
coincides with that implied by model (N) only in the improbable particular case
when $\lambda\mu = \nu$. Thus, given a substantial number of observations of the simultaneous
values of the random variables $X_1, X_2, \dots, X_n$, it is possible to subject to a test the
hypothesis that the accidents considered are noncontagious and/or that there is no
time effect. Details of such tests must be relegated to a separate paper. However, the
authors anticipate that the power of the test contemplated will not be a very satisfactory
one. On the other hand, it seems plausible that the power of a test of the
same hypotheses may be much better if these hypotheses are tested on the observations,
not of the variables $X_1, X_2, \dots, X_n$ considered above, but of time intervals
between successive accidents incurred by particular individuals. A complete study
of this problem also requires more space than can be given in the present paper.
However, it seems appropriate to include the present section outlining the new
approach in relation to a small section of the statistical data that one may expect to
have available, namely, in relation to those individuals who, during the period of
observation, sustain exactly one accident. In this outline we shall assume $\theta = 1$. On
the other hand, it appears possible to liberalize a little the original scheme of Pólya
by not insisting that the dependence of the derivatives (6) on the number m of
previous accidents is necessarily linear and by admitting that there may be a variation
in accident proneness from one individual of the population to the next.

Consider an individual I and assume that from the moment $t = 0$ on, he is exposed
to risk of accidents of a particular kind. For this individual we shall consider probabilities
$P_{m,n}(T_1, T_2)$, defined in section 2, and shall assume that these probabilities
depend on the number m of accidents sustained up to and including moment $T_1$ and
also on the value of $T_1$ but not on the precise moments when the previous accidents
occurred. Specifically, we shall assume that


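(87)  $\left.\dfrac{\partial P_{m,n}(T_1, T_2)}{\partial T_2}\right|_{T_2 = T_1} = \begin{cases} -\dfrac{\lambda_m}{1 + \nu T_1}, & \text{if } n = 0 \\ \dfrac{\lambda_m}{1 + \nu T_1}, & \text{if } n = 1 \\ 0, & \text{if } n > 1 \end{cases}$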

where $\lambda_0, \lambda_1, \dots, \lambda_m, \dots$ are arbitrary nonnegative numbers and $\nu$ is subject to the
restriction that for values of t limited to the period of observation $1 + \nu t > 0$.
Following  the  usual  procedure,  it  is  easy  to  find  that 


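(88)  $P_{m,0}(T_1, T_2) = \left(\dfrac{1 + \nu T_1}{1 + \nu T_2}\right)^{\lambda_m/\nu}$

and, using the assumption that $\lambda_m \ne \lambda_{m+1}$,

(89)  $P_{m,1}(T_1, T_2) = \dfrac{\lambda_m}{\lambda_{m+1} - \lambda_m}\left[\left(\dfrac{1 + \nu T_1}{1 + \nu T_2}\right)^{\lambda_m/\nu} - \left(\dfrac{1 + \nu T_1}{1 + \nu T_2}\right)^{\lambda_{m+1}/\nu}\right]$ .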
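The "usual procedure" leading from (87) to the probabilities of no accident and of exactly one accident can be checked numerically. In the Python sketch below, the function names and parameter values are hypothetical. The closed form used for exactly one accident is the expression obtained by integrating over the moment of the single accident, and it is stated here as an assumption of the sketch. The numerical version performs that integration directly by the midpoint rule.

```python
def p_no_accident(lam_m, nu, t1, t2):
    """Probability of no accident in (t1, t2) given m accidents up to t1."""
    return ((1.0 + nu * t1) / (1.0 + nu * t2)) ** (lam_m / nu)


def p_one_accident(lam_m, lam_next, nu, t1, t2):
    """Assumed closed form for exactly one accident in (t1, t2); needs lam_next != lam_m."""
    ratio = (1.0 + nu * t1) / (1.0 + nu * t2)
    return lam_m / (lam_next - lam_m) * (ratio ** (lam_m / nu) - ratio ** (lam_next / nu))


def p_one_accident_numeric(lam_m, lam_next, nu, t1, t2, steps=20000):
    """Integrate over the moment s of the accident: no accident on (t1, s), an accident
    with intensity lam_m / (1 + nu*s) at s, then no accident on (s, t2)."""
    h = (t2 - t1) / steps
    total = 0.0
    for k in range(steps):
        s = t1 + (k + 0.5) * h
        total += (
            p_no_accident(lam_m, nu, t1, s)
            * lam_m / (1.0 + nu * s)
            * p_no_accident(lam_next, nu, s, t2)
        )
    return total * h


# Hypothetical values: lambda_m = 0.8, lambda_{m+1} = 1.3, nu = 0.4, interval (2, 3).
CLOSED = p_one_accident(0.8, 1.3, 0.4, 2.0, 3.0)
NUMERIC = p_one_accident_numeric(0.8, 1.3, 0.4, 2.0, 3.0)
```

The two values agree to within the accuracy of the quadrature, and the probability of no accident falls as `lam_m` grows, in line with the remark that follows.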

Obviously, $P_{m,0}(T_1, T_2)$ is a decreasing function of $\lambda_m$. Thus, if all the $\lambda_m$'s have the
same value, the model implies the absence of contagion in the accidents. If the $\lambda_m$'s
increase,

(90)  $\lambda_0 < \lambda_1 < \cdots < \lambda_m < \lambda_{m+1} < \cdots$ ,
then we shall speak of "regular positive" contagion, meaning that the more accidents
the individual had in the past, the more intense is the risk of accidents in the
future. If the $\lambda$'s decrease,

(91)  $\lambda_0 > \lambda_1 > \cdots > \lambda_m > \lambda_{m+1} > \cdots$ ,

then  we  shall  speak  of  “regular  negative”  contagion.  This  would  be  the  case  where 
previous  accidents  “teach”  the  individual  how  to  avoid  accidents  in  the  future. 

Finally there is the possibility of the sequence of the $\lambda$'s being nonmonotone. In
this case we shall speak of "irregular contagion."
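The classification just described can be written as a short Python function. This is an illustrative sketch: the function name and the returned labels are hypothetical, and the treatment of sequences that are only weakly monotone (some equal and some unequal neighbours) as "irregular" is an assumption of the sketch, since the text defines the regular cases by strict inequalities.

```python
def classify_contagion(lambdas):
    """Classify a finite sequence lambda_0, lambda_1, ... by the type of contagion."""
    pairs = list(zip(lambdas, lambdas[1:]))
    if all(b == a for a, b in pairs):
        return "none"              # all intensities equal: absence of contagion
    if all(b > a for a, b in pairs):
        return "regular positive"  # strictly increasing, as in (90)
    if all(b < a for a, b in pairs):
        return "regular negative"  # strictly decreasing, as in (91)
    return "irregular"             # anything else (assumption for weakly monotone cases)
```

For instance, the linear dependence of the original Pólya scheme with a positive slope gives a strictly increasing sequence and hence falls under "regular positive".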

Now we shall assume that the individual I is observed for a unit of time, from $T_1$
to $T_1 + 1$. The object of this section is to deduce the conditional distribution of the
random variable $\tau$ defined as the time between the beginning of the period of observation
$T_1$ and the moment when the individual I sustains an accident, given that
up to and including moment $T_1$ the individual sustained exactly m accidents and
given that between $T_1$ and $T_1 + 1$ he sustains exactly one accident. Obviously

$0 < \tau \le 1$.

For this purpose we compute the conditional probability, given exactly m accidents
up to moment $T_1$, of the simultaneous occurrence of two events. One event is
that during the period of observation the individual I sustains exactly one accident.
The second event consists in $\tau$ not exceeding an arbitrary positive number $t \le 1$.
The conditional probability just defined coincides with the conditional probability,
given exactly m accidents up to moment $T_1$, that between $T_1$ and $T_1 + t$ the individual
will have exactly one accident and that between $T_1 + t$ and $T_1 + 1$ he will
have no accidents. Thus, this probability is easy to obtain from (88) and (89),


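(92)  $P_{m,1}(T_1, T_1 + t)\,P_{m+1,0}(T_1 + t, T_1 + 1) = \dfrac{\lambda_m}{\lambda_{m+1} - \lambda_m}\left(\dfrac{a}{a + \nu}\right)^{\lambda_{m+1}/\nu}\left[\left(\dfrac{a + \nu t}{a}\right)^{(\lambda_{m+1} - \lambda_m)/\nu} - 1\right]$

where, for the sake of simplicity,

(93)  $a = 1 + \nu T_1$ .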

Dividing (92) by the conditional probability of exactly one accident between $T_1$
and $T_1 + 1$, we obtain the conditional probability of $\tau \le t$, given that up to $T_1$ the
individual I had exactly m accidents and that in $(T_1, T_1 + 1)$ he sustains exactly one
of them. This probability is, then, the conditional distribution function of $\tau$, say

(94)  $F_\tau(t \mid \psi_m, \nu) = \dfrac{P_{m,1}(T_1, T_1 + t)\,P_{m+1,0}(T_1 + t, T_1 + 1)}{P_{m,1}(T_1, T_1 + 1)} = \dfrac{\left(1 + \dfrac{\nu t}{a}\right)^{\psi_m/\nu} - 1}{\left(1 + \dfrac{\nu}{a}\right)^{\psi_m/\nu} - 1}$
where, for the sake of brevity, $\psi_m = \lambda_{m+1} - \lambda_m$. The corresponding probability
density function is, say

(95)  $p_\tau(t \mid \psi_m, \nu) = \dfrac{\psi_m (a + \nu t)^{\psi_m/\nu - 1}}{(a + \nu)^{\psi_m/\nu} - a^{\psi_m/\nu}}$ .


We are particularly interested in the following three special cases obtainable from
(95) by simple passages to the limit.

i) If $\psi_m = 0$, but $\nu$ is unspecified, then the mth accident is "noncontagious" and
we have, say

(96)  $p_\tau[t \mid (\psi_m = 0), \nu] = \dfrac{\nu}{(a + \nu t)\log\left(1 + \dfrac{\nu}{a}\right)}$ .

ii) If $\nu = 0$ but $\psi_m$ is unspecified, then there is no time effect but there may be
contagion and

(97)  $p_\tau[t \mid \psi_m, (\nu = 0)] = \dfrac{\psi_m e^{\psi_m t}}{e^{\psi_m} - 1}$ .

iii) Finally, when both $\psi_m = 0$ and $\nu = 0$, then we have the no contagion-no time-effect
case and

(98)  $p_\tau[t \mid (\psi_m = 0), (\nu = 0)] = 1$ .

All five formulae (94) through (98) are given for $0 < t \le 1$. They refer to a particular
individual with a fixed number m of previous accidents and with fixed parameters, and
it will be seen that their use is likely to give a substantial insight into the mechanism
of accident proneness. The particularly interesting point is that, at least in some
respects, the effect of variation from one individual of the population to another is
now divided from contagion and time effect. Thus, if accident proneness conforms
exactly with the no contagion-no time-effect mixture model of Greenwood, Yule,
and Newbold, whether including the particular postulated distribution function of $\Lambda$
or not, then the time intervals $\tau$ observed for arbitrary individuals of the population
will be uniformly distributed between zero and unity as implied by (98). Any departure
from this distribution is, then, evidence of either time effect or contagion
or both. Furthermore, the distribution (95) applicable in the general case coincides
with (98) only when $\psi_m = \nu$.
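The behaviour of the density of the time to the single accident can be explored with a short Python sketch. The code is illustrative only: the function names, the tolerance `eps`, and the parameter values are hypothetical. The general branch uses the density in the form $\psi_m(a + \nu t)^{\psi_m/\nu - 1} / [(a + \nu)^{\psi_m/\nu} - a^{\psi_m/\nu}]$ with $a = 1 + \nu T_1$, which is assumed here to be the content of (95), and the remaining branches are its three limiting cases.

```python
import math


def tau_density(t, psi, nu, t1, eps=1e-12):
    """Conditional density at 0 < t <= 1 of the time to the single accident.

    psi is the difference between successive intensities, nu the time-effect
    parameter, and t1 the start of the period of observation.
    """
    a = 1.0 + nu * t1  # equals 1 when there is no time effect
    if abs(psi) < eps and abs(nu) < eps:
        return 1.0  # no contagion, no time effect: uniform density
    if abs(psi) < eps:
        return nu / ((a + nu * t) * math.log(1.0 + nu / a))  # time effect only
    if abs(nu) < eps:
        return psi * math.exp(psi * t) / (math.exp(psi) - 1.0)  # contagion only
    e = psi / nu
    return psi * (a + nu * t) ** (e - 1.0) / ((a + nu) ** e - a ** e)  # general case


def integral_over_unit_interval(psi, nu, t1, steps=4000):
    """Midpoint-rule integral of the density over (0, 1); should be close to 1."""
    h = 1.0 / steps
    return h * sum(tau_density((k + 0.5) * h, psi, nu, t1) for k in range(steps))
```

With `psi` equal to `nu` the general branch returns unity for every `t`, so contagion and time effect of equal strength leave no trace in the distribution. Positive `psi - nu` tilts the density toward the end of the period and negative `psi - nu` toward its beginning.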

The  identity,  with  respect  to  any  characteristic,  of  all  individuals  of  a  living 
population  is  always  rather  improbable.  In  particular  it  may  be  taken  for  granted 
that  m  will  vary  from  one  individual  to  another.  Consequently,  if  by  and  large  the 
accidents  studied  are  subject  to  contagion  and/or  to  time  effect,  it  is  most  likely 
that at least for some individuals of the population the equality $\psi_m = \nu$ will not be
satisfied and that, therefore, the study of the empirical distribution of $\tau$ will indicate
the true nature of the machinery of accident proneness. This is particularly probable
in the two "regular" cases, in which the sequence of the lambdas is monotone.
However, it may be hoped that a study of time intervals for individuals incurring
two or more accidents during the period of observations will throw some light also
on the irregular case in which, owing to the variation in m, $\psi_m$ is positive for some
individuals of the population and negative for others.